Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

POC for #196 - Relax computed importstr #805

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mbarbero
Copy link

@mbarbero mbarbero commented May 8, 2020

Adds a new builtin function (importstrDyn) that accept a computed string.

I did not modified the importstr logic itself as I don't understand well
enough how the parser is working. Of course, it would better if importstr could be modified to accept any computed string.

With test.jsonnet

{
  a: {
    b: "dyn",
    c: "file-%s.txt" % self.b,
    d: "file-dyn.txt",
    
    r1: importstr "file-dyn.txt",
    r2: std.importstrDyn("file-dyn.txt"),
    r3: std.importstrDyn("file-%s.txt" % self.b),
    r4: std.importstrDyn(self.c),
    r5: std.importstrDyn(self.d),
  },
}

and with file-dyn.txt

>> 42

Executing $ jsonnet test.jsonnet outputs

{
   "a": {
      "b": "dyn",
      "c": "file-dyn.txt",
      "d": "file-dyn.txt",
      "r1": ">> 42",
      "r2": ">> 42",
      "r3": ">> 42",
      "r4": ">> 42",
      "r5": ">> 42"
   }
}

Hopefully, this PR will trigger new discussions and maybe the addition of such a feature. I really don't see how this would diminish the hermiticity (not more than extVar or tla IMO) nor how it would prevents static analysis.

Adds a new builtin function (importstrDyn) that accept a computed string.

I did not modified the importstr logic itself as I don't understand well
enough how the parser is working.
@mbarbero
Copy link
Author

mbarbero commented Jun 9, 2020

Any feedbacks?

@ant31
Copy link

ant31 commented Jun 9, 2020

This is one of my main blocker to not create external commandline tool in order to use std.native function and do the same.

@sbarzowski
Copy link
Collaborator

Thanks for the ping and sorry I didn't reply sooner. This is a long standing discussion. Basically the import mechanism was intended as a way to statically import libraries and split code into multiple packages, not for loading stuff dynamically. The matter is not settled, but I expect the impact on the language to be (slightly) net negative.

Here's my summary of arguments for and against relaxing the restriction.

Pros:

  • Allows creating functions like importJson, importYaml, importIni.
  • Allows reading existing data represented as multiple files more easily.
  • Allows using import and importstr on the same file without duplication (useful for documentation generation).
  • Allows using plain data manifests to configure Jsonnet processing (see Relax computed importstr ? #196 (comment)).
  • ?

Cons:

  • No longer possible to statically determine all the paths that could be imported from a given file. This gets in the way of some current efforts (linter, documentation generator) and some potential ones too.
  • Related to the previous point, it's impossible to audit all accessed files by simply grepping for import / importstr.
  • Encourages programming patterns which make it harder to see where the code is coming from. The imports are intended mostly to be used in the same way as imports in Python, includes in C++ etc. That is, a bunch of statically known imports at the top which refer to other files in the same repository or to dependencies which are provided in JPATH. If we added computed import people would start introducing prefixes, functions which read a bunch of files based on conventions etc.
  • Encourages calculating the absolute path instead of using JPATH for handling dependencies.
  • Enables doing some wildly inappropriate things such as reading a manifest of installed Jsonnet configs from absolute path and importing Jsonnet files based on paths found there.
  • Makes it harder for people to migrate between different importers, because they depend more on the path structure.
  • Makes it harder to add New Better Imports™ one day (e.g. dotted imports with better handling of dependencies-of-dependencies), because people do their own calculation.

The cons are not really deal breakers, but the pros seem like relatively minor convenience things and the alternatives are not so bad.

The most important things if you want to move this forward would be to describe the use cases and why the alternatives are not sufficient (why not specify the paths directly and why not pass the data as TLA).

FYI the implementation on the interpreter side is really not a problem. This is entirely an issue of language design.

This is one of my main blocker to not create external commandline tool in order to use std.native function and do the same.

I'm not sure what you mean. Can you elaborate? Do you mean using Jsonnet as a library and providing a native function there which emulates computed imports? Do you still have the same use case as you mentioned in #196?

@ant31
Copy link

ant31 commented Jun 10, 2020

I'm not sure what you mean. Can you elaborate? Do you mean using Jsonnet as a library and providing a native function there which emulates computed imports?

Yes, using Jsonnet as a library to implement additional methods such as importJson, importYaml.
Several other tools based on Jsonnet are doing the same like bitnami/kubecfg or grafana/tanka.

@sbarzowski
Copy link
Collaborator

sbarzowski commented Jun 10, 2020

Yeah, importYaml stuff is is the most convincing use case here. We're also looking into alternative solutions, e.g. first-class support for loading different types of files with import.

And you can always do things like std.parseJson(importstr 'yourfile.json') which is maybe slightly verbose, but it works.

@ant31
Copy link

ant31 commented Jun 10, 2020

@sbarzowski also I like to 'hide' jsonnet from user when it's not needed and make it like seems like basic json/yaml.

So instead to requires the end-user to know about jsonnet to write the following:

build(
{
config: std.parseJson(importstr './config/myconfig.json')
}
)

I prefer:

build(
{
config_file:  './config/myconfig.json'
}
)

And process the user input in the build function.
This way, everything inside build() is very close to raw json. it is a good way to onboard new users to jsonnet and slowly ramp their knowledge about it.

@mbarbero
Copy link
Author

My use case if the following: I have a 2 phases process. First phase uses a jsonnet program to generate a json file driving an imperative process to generate some files (generated file names comes from some info in the json file). In the second step I generate a json that aggregates some of the files generated during the imperative process. As you may imagine, in the first phase the filename/key of the target filenames are dynamic/configurable (users of the jsonnet program can even add supplemental entries).

phase1.jsonnet

{
  files: {
    file1: <configuration to generate content of file1>,
    file2: <configuration to generate content of file2>,
  }
}

file1 and file2 are generated by an imperative process.

phase2.jsonnet (we use -m to generate files in a different folder)

{
  "file1": "before %s after" % importstr 'file1'
  "file2": "before %s after" % importstr 'file2'
}

phase2_wannabe.jsonnet

local config = import "phase1.jsonnet";
{
  data: {
    [filename]: "before %s after" % (importstr filename) for filename in config.files,
  }
}

An alternative to that would be to do higher order jsonnet (ie generate phase2.jsonnet during phase1), but I feel this would pushing it too much and think that computed importstr would be better / more readable.

Does it help you understand the use case?

PS: the real use case is related to kubernetes secrets generation. Phase1 define what secrets should be deployed and from which vault they should be retrieved. Imperative process reads this config and generates secrets files from there. Phase 2 generates kubernetes secret resources by importstr the content of the secret files.

The kube secret resources could be generated directly in the imperative process, but all other kubernetes resources are generated during phase 2 and we try to do as little as possible during the imperative phase to ease debugging / testing / reproducibility.

@sbarzowski
Copy link
Collaborator

sbarzowski commented Jun 10, 2020

Thanks for providing more context.

Re @ant31:
Trying to hide Jsonnet from the users sounds is a bit of a boundary case from language design perspective. Are you sure importstr is not acceptable? You don't need to wrap it in parseJson in that part of code, so it's just a matter of importstr.

Re @mbarbero:

we try to do as little as possible during the imperative phase to ease debugging / testing / reproducibility

Yeah, good call, that's kind of the point of Jsonnet – keep logic pure, sane and separate from any operational stuff.

An alternative to that would be to do higher order jsonnet (ie generate phase2.jsonnet during phase1), but I feel this would pushing it too much

Agreed. Generating Jsonnet source code with Jsonnet is going to be hard to follow.

Phase 2 generates kubernetes secret resources by importstr the content of the secret files.

How about loading the content of secret files using extVar/tlaVar? Is there some reason why that wouldn't be practical?

@mbarbero
Copy link
Author

mbarbero commented Jun 11, 2020

@sbarzowski Thanks for your efforts in identifying our precise needs.

Phase 2 generates kubernetes secret resources by importstr the content of the secret files.

How about loading the content of secret files using extVar/tlaVar? Is there some reason why that wouldn't be practical?

We're talking dozens of files here. It would create humongous command line or function parameters list. Also, secrets file don't just contain a single password or passphrase. Sometimes, it's complete files with dozens of passwords in XML... (and this is out of our control). That makes both extVar and tlaVar impracticable.

@ant31
Copy link

ant31 commented Jun 11, 2020

Trying to hide Jsonnet from the users sounds is a bit of a boundary case from language design perspective. Are you sure importstr is not acceptable? You don't need to wrap it in parseJson in that part of code, so it's just a matter of importstr.

Not that much, it actually joins @mbarbero case, because you could externalise this configuration/user input in a yaml :

values.yaml:

config_file: ./config/myconfig.json
username: me
password: bla

manifest.jsonnet

{
values: parseYaml(importstr 'values.yaml')
config: importDyn(values.config_file)
}

@sbarzowski
Copy link
Collaborator

You don't need to pass every single secret as a separate tla/extVar – that of course could get quickly out of hand. I meant passing all secrets at once in one structure file. E.g. you could have a secrets.json which contains all secrets (or it could contain XMLs which contains the actual if that's your thing).

If you already have a directory structure with all these secrets, you can serialize them all to JSON with a tool such as dir2json (https://github.com/sbarzowski/dir2json).

# I imagine it could be the last step in the "imperative stage"
dir2json generated_secrets/ > secrets.json
jsonnet --ext-var secrets=secrets.json foo.jsonnet

The main disadvantage of this approach is having to copy all of these secrets to one place ahead of time, whether or not they're going to actually be used. It also doesn't work well if your directory structure has symlinks etc. This is something you can use here and now, independently from what we decide here.

@sbarzowski
Copy link
Collaborator

What do you think about providing --tla-var-dir/--ext-var-dir which would provide an object similar to what you would get with dir2json but built more efficiently and without the need of external tool and temporary files. Perhaps it could even load the files lazily. It would load all files inside as strings, so it wouldn't allow running Jsonnet files from inside.

How useful do you think that would be for use cases like yours?

@sbarzowski
Copy link
Collaborator

@sparkprime Since we got back to this discussion, perhaps you have some thoughts?

@mbarbero
Copy link
Author

What do you think about providing --tla-var-dir/--ext-var-dir which would provide an object similar to what you would get with dir2json but built more efficiently and without the need of external tool and temporary files. Perhaps it could even load the files lazily. It would load all files inside as strings, so it wouldn't allow running Jsonnet files from inside.

How useful do you think that would be for use cases like yours?

@sbarzowski Sorry for late reply. Yes, this solution would be workable. As long as it's supported natively by jsonnet (without having to call the intermediate dir2json binary), I think it's a decent alternative to computed importstr in my case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants